oracle-db-examples/machine-learning/notebooks/sql/OML4SQL Feature Selection Algorithm Based.dsnb at main · oracle-samples/oracle-db-examples · GitHub

[{"layout":null,"template":null,"templateConfig":null,"name":"OML4SQL Feature Selection Algorithm Based","description":null,"readOnly":false,"type":"low","paragraphs":[{"col":0,"visualizationConfig":null,"hideInIFrame":false,"selectedVisualization":null,"title":null,"message":["%md"," "],"enabled":true,"result":null,"sizeX":0,"hideCode":true,"width":12,"hideResult":true,"dynamicFormParams":null,"row":0,"hasTitle":false,"hideVizConfig":true,"hideGutter":true,"relations":[],"forms":"[]"},{"col":0,"visualizationConfig":null,"hideInIFrame":false,"selectedVisualization":"html","title":null,"message":["%md","","# OML4SQL Feature Selection: Supervised Algorithm","In this notebook, we demonstrate how to perform feature selection using in-database supervised algorithms.","","We use the customer insurance lifetime value data set, which contains customer financial information, lifetime value, and whether or not the customer bought insurance.","","We first rank the attributes with the in-database Attribute Importance algorithm (`ALGO_AI_MDL`). We then build a random forest model to predict whether the customer will buy insurance and use its feature importance values for feature selection. ","","Finally, we build a decision tree model for the same classification task and obtain its split nodes. For the split nodes with the highest support, we select the features associated with those nodes.","","The dataset `CUSTOMER_INSURANCE_LTV` is generated by the `\"OML Run-me-first\"` notebook, which `MUST` be run before this notebook.","","---","","###### `IMPORTANT`: The `\"OML Run-me-first\"` notebook is available under the menu Templates -> Examples and is a prerequisite for the current notebook.","","---","","","Copyright (c) 2024 Oracle Corporation ","###### <a href=\"https://oss.oracle.com/licenses/upl/\" onclick=\"return ! 
window.open('https://oss.oracle.com/licenses/upl/');\">The Universal Permissive License (UPL), Version 1.0<\/a>","---"],"enabled":true,"result":null,"sizeX":0,"hideCode":true,"width":12,"hideResult":false,"dynamicFormParams":null,"row":0,"hasTitle":false,"hideVizConfig":true,"hideGutter":true,"relations":[],"forms":"[]"},{"col":0,"visualizationConfig":null,"hideInIFrame":false,"selectedVisualization":"html","title":"For more information ...","message":["%md","","* <a href=\"https://docs.oracle.com/en/cloud/paas/autonomous-data-warehouse-cloud/index.html\" target=\"_blank\">Oracle ADB Documentation<\/a>","* <a href=\"https://github.com/oracle-samples/oracle-db-examples/tree/main/machine-learning\" target=\"_blank\">OML folder on Oracle GitHub<\/a>","* <a href=\"https://www.oracle.com/machine-learning\" target=\"_blank\">OML Web Page<\/a>","* <a href=\"https://www.oracle.com/goto/ml-attribute-importance\" target=\"_blank\">OML Attribute Importance<\/a>","* <a href=\"https://www.oracle.com/goto/ml-random-forest\" target=\"_blank\">OML Random Forest<\/a>","* <a href=\"https://www.oracle.com/goto/ml-decision-tree\" target=\"_blank\">OML Decision Tree<\/a>"],"enabled":true,"result":null,"sizeX":0,"hideCode":true,"width":12,"hideResult":false,"dynamicFormParams":null,"row":0,"hasTitle":true,"hideVizConfig":true,"hideGutter":true,"relations":[],"forms":"[]"},{"col":0,"visualizationConfig":null,"hideInIFrame":false,"selectedVisualization":"table","title":"Display CUSTOMER_INSURANCE_LTV data","message":["%sql","","SELECT *","FROM CUSTOMER_INSURANCE_LTV","FETCH FIRST 10 ROWS ONLY"],"enabled":true,"result":null,"sizeX":0,"hideCode":false,"width":12,"hideResult":false,"dynamicFormParams":null,"row":0,"hasTitle":true,"hideVizConfig":false,"hideGutter":true,"relations":[],"forms":"[]"},{"col":0,"visualizationConfig":null,"hideInIFrame":false,"selectedVisualization":"html","title":"Examples of possible setting overrides for Attribute Importance","message":["%md","","If the user 
does not override the default settings, then relevant settings are determined by the algorithm.","","A complete list of shared settings can be found in the Documentation link:","","Shared Settings: <a href=\"https://docs.oracle.com/en/database/oracle/oracle-database/23/arpls/DBMS_DATA_MINING.html#GUID-24047A09-0542-4870-91D8-329F28B0ED75\" onclick=\"return ! window.open('https://docs.oracle.com/en/database/oracle/oracle-database/23/arpls/DBMS_DATA_MINING.html#GUID-24047A09-0542-4870-91D8-329F28B0ED75');\">All algorithms<\/a>","","--- ","","##### Other interesting overrides for data preparation","","Oracle Machine Learning supports fully Automatic Data Preparation (ADP), user-directed general data preparation, and user-specified embedded data preparation. The PREP_* settings enable the user to request fully automated or user-directed general data preparation. By default, fully Automatic Data Preparation (ON) is enabled. ","","A complete list of settings can be found in the Documentation link:","","Automatic Data Preparation: <a href=\"https://docs.oracle.com/en/database/oracle/oracle-database/23/arpls/DBMS_DATA_MINING.html#GUID-5043274C-C753-47DE-9E60-D8528ADAC78D\" onclick=\"return ! window.open('https://docs.oracle.com/en/database/oracle/oracle-database/23/arpls/DBMS_DATA_MINING.html#GUID-5043274C-C753-47DE-9E60-D8528ADAC78D');\">Automatic Data Preparation<\/a>","","This setting is used to specify Automatic Data Preparation."," The value required is either ON or OFF, and the default is ON. "," v_setlst('PREP_AUTO') := 'ON';"," ","The model uses heuristics to transform the build data according to the requirements of the algorithm. ","","##### User defined transformations","","Instead of fully ADP, the user can request that the data be shifted and/or scaled with the PREP_SCALE* and PREP_SHIFT* settings. The transformation instructions are stored with the model and reused whenever the model is applied. The model settings can be viewed in USER_MINING_MODEL_SETTINGS. 
","","* This setting enables scaling data preparation for two-dimensional numeric columns. PREP_AUTO must be OFF for this setting to take effect. The following are the possible values:"," PREP_SCALE_STDDEV: A request to divide the column values by the standard deviation of the column and is often provided together with PREP_SHIFT_MEAN to yield z-score normalization."," PREP_SCALE_RANGE: A request to divide the column values by the range of values and is often provided together with PREP_SHIFT_MIN to yield a range of [0,1]."," v_setlst('PREP_SCALE_2DNUM') := 'PREP_SCALE_STDDEV';","","* This setting enables centering data preparation for two-dimensional numeric columns. PREP_AUTO must be OFF for this setting to take effect. The following are the possible values:"," PREP_SHIFT_MEAN: Results in subtracting the average of the column from each value."," PREP_SHIFT_MIN: Results in subtracting the minimum of the column from each value."," v_setlst('PREP_SHIFT_2DNUM') := 'PREP_SHIFT_MEAN';","","* This setting enables scaling data preparation for nested numeric columns. PREP_AUTO must be OFF for this setting to take effect. 
If specified, then the valid value for this setting is PREP_SCALE_MAXABS, which yields data in the range of [-1,1]."," v_setlst('PREP_SCALE_NNUM') := 'PREP_SCALE_MAXABS';","","---"],"enabled":true,"result":null,"sizeX":0,"hideCode":true,"width":12,"hideResult":false,"dynamicFormParams":null,"row":0,"hasTitle":true,"hideVizConfig":true,"hideGutter":true,"relations":[],"forms":"[]"},{"col":0,"visualizationConfig":null,"hideInIFrame":false,"selectedVisualization":"raw","title":"Use in-database Attribute Importance ","message":["\r","%script\r","\r","BEGIN DBMS_DATA_MINING.DROP_MODEL('AI_FS_MODEL');\r","EXCEPTION WHEN OTHERS THEN NULL; END;\r","/\r","DECLARE\r"," V_SETLST DBMS_DATA_MINING.SETTING_LIST;\r","BEGIN\r"," V_SETLST('ALGO_NAME') := 'ALGO_AI_MDL';\r"," V_SETLST('PREP_AUTO') := 'ON';\r","\r"," DBMS_DATA_MINING.CREATE_MODEL2(\r"," MODEL_NAME => 'AI_FS_MODEL',\r"," MINING_FUNCTION => 'ATTRIBUTE_IMPORTANCE',\r"," DATA_QUERY => 'SELECT * FROM CUSTOMER_INSURANCE_LTV',\r"," SET_LIST => V_SETLST,\r"," CASE_ID_COLUMN_NAME => 'CUSTOMER_ID',\r"," TARGET_COLUMN_NAME => 'BUY_INSURANCE');\r","END;\r"],"enabled":true,"result":null,"sizeX":0,"hideCode":false,"width":12,"hideResult":false,"dynamicFormParams":null,"row":0,"hasTitle":true,"hideVizConfig":false,"hideGutter":true,"relations":[],"forms":"[]"},{"col":0,"visualizationConfig":null,"hideInIFrame":false,"selectedVisualization":"table","title":"Check the model view output with attribute rankings","message":["%sql","","SELECT ATTRIBUTE_NAME, ATTRIBUTE_IMPORTANCE_VALUE, ATTRIBUTE_RANK ","FROM DM$VAAI_FS_MODEL ","WHERE ATTRIBUTE_IMPORTANCE_VALUE > 0","ORDER BY ATTRIBUTE_IMPORTANCE_VALUE DESC","FETCH FIRST 10 ROWS ONLY;"],"enabled":true,"result":null,"sizeX":0,"hideCode":false,"width":12,"hideResult":false,"dynamicFormParams":null,"row":0,"hasTitle":true,"hideVizConfig":false,"hideGutter":true,"relations":[],"forms":"[]"},{"col":0,"visualizationConfig":null,"hideInIFrame":false,"selectedVisualization":"bar","title":"Plot 
attributes by importance for predicting people who buy insurance","message":["%sql","","SELECT ATTRIBUTE_NAME, ATTRIBUTE_IMPORTANCE_VALUE, ATTRIBUTE_RANK ","FROM DM$VAAI_FS_MODEL ","WHERE ATTRIBUTE_IMPORTANCE_VALUE > 0","ORDER BY ATTRIBUTE_IMPORTANCE_VALUE DESC","FETCH FIRST 10 ROWS ONLY;"],"enabled":true,"result":null,"sizeX":0,"hideCode":false,"width":12,"hideResult":false,"dynamicFormParams":null,"row":0,"hasTitle":true,"hideVizConfig":false,"hideGutter":true,"relations":[],"forms":"[]"},{"col":0,"visualizationConfig":null,"hideInIFrame":false,"selectedVisualization":"html","title":null,"message":["%md","","#### Feature importance using Random Forest","---"],"enabled":true,"result":null,"sizeX":0,"hideCode":true,"width":12,"hideResult":false,"dynamicFormParams":null,"row":0,"hasTitle":false,"hideVizConfig":true,"hideGutter":true,"relations":[],"forms":"[]"},{"col":0,"visualizationConfig":null,"hideInIFrame":false,"selectedVisualization":"html","title":"Examples of possible setting overrides for Random Forest","message":["%md","","","If the user does not override the default settings, then relevant settings are determined by the algorithm.","","**NOTE:** Random Forest makes use of the Decision Tree settings to configure the construction of individual trees. ","","A complete list of settings can be found in the Documentation link:","","Algorithm Settings: <a href=\"https://docs.oracle.com/en/database/oracle/oracle-database/23/arpls/DBMS_DATA_MINING.html#GUID-481B6C67-B26E-4689-AD4C-98062D5A2117\" onclick=\"return ! window.open('https://docs.oracle.com/en/database/oracle/oracle-database/23/arpls/DBMS_DATA_MINING.html#GUID-481B6C67-B26E-4689-AD4C-98062D5A2117');\">Random Forest<\/a>","","Algorithm Settings: <a href=\"https://docs.oracle.com/en/database/oracle/oracle-database/23/arpls/DBMS_DATA_MINING.html#GUID-03435110-D723-42FD-B4EA-39C86A039566\" onclick=\"return ! 
window.open('https://docs.oracle.com/en/database/oracle/oracle-database/23/arpls/DBMS_DATA_MINING.html#GUID-03435110-D723-42FD-B4EA-39C86A039566');\">Decision Tree<\/a>","","Shared Settings: <a href=\"https://docs.oracle.com/en/database/oracle/oracle-database/23/arpls/DBMS_DATA_MINING.html#GUID-24047A09-0542-4870-91D8-329F28B0ED75\" onclick=\"return ! window.open('https://docs.oracle.com/en/database/oracle/oracle-database/23/arpls/DBMS_DATA_MINING.html#GUID-24047A09-0542-4870-91D8-329F28B0ED75');\">All algorithms<\/a>","","* Specify a row weight column "," v_setlst('ODMS_ROW_WEIGHT_COLUMN_NAME') := '<row_weight_column_name>';"," ","* Specify a missing value treatment method for the training data. This setting does not affect the scoring data. The default value is `ODMS_MISSING_VALUE_AUTO`. The option `ODMS_MISSING_VALUE_MEAN_MODE` replaces missing values with the mean (numeric attributes) or the mode (categorical attributes) both at build time and apply time where appropriate. The option `ODMS_MISSING_VALUE_AUTO` performs different strategies for different algorithms. When `ODMS_MISSING_VALUE_TREATMENT` is set to `ODMS_MISSING_VALUE_DELETE_ROW`, the rows in the training data that contain missing values are deleted. However, if you want to replicate this missing value treatment in the scoring data, then you must perform the transformation explicitly."," v_setlst('ODMS_MISSING_VALUE_TREATMENT') := 'ODMS_MISSING_VALUE_AUTO';","","##### These settings configure the behavior of the Random Forest algorithm ","","* Specify the number of trees in the forest. It requires a number between 1 and 65535 (including the edges). The default is 20."," v_setlst('RFOR_NUM_TREES') := '20';","","* Specify the fraction of the training data to be randomly sampled for use in the construction of an individual tree. The default is half of the number of rows in the training data. 
It requires a fraction between 0 and 1 (excluding the edges)."," v_setlst('RFOR_SAMPLING_RATIO') := '0.5';","","* Specify the size of the random subset of columns to be considered when choosing a split at a node. For each node, the size of the pool remains the same, but the specific candidate columns change. The default is half of the columns in the model signature. The special value 0 indicates that the candidate pool includes all columns. It requires a number equal to or greater than 0."," v_setlst('RFOR_MTRY') := '0';","","##### Decision Tree settings to configure the construction of individual trees"," ","* Specify the impurity metric for each tree. "," Tree algorithms seek the best test question for splitting data at each node. The best splitter and split values are those that result in the largest increase in target value homogeneity (purity) for the entities in the node. Purity is measured by a metric. Decision trees can use either Gini `TREE_IMPURITY_GINI` or entropy `TREE_IMPURITY_ENTROPY` as the purity metric. By default, the algorithm uses `TREE_IMPURITY_GINI`."," v_setlst('TREE_IMPURITY_METRIC') := 'TREE_IMPURITY_GINI';"," ","* Specify the criteria for splits regarding the maximum tree depth (the maximum number of nodes between the root and any leaf node, including the leaf node)."," For Decision Tree, it requires a number between 2 and 20, and the default is 7. For Random Forest it is a number between 2 and 100, and the default is 16."," v_setlst('TREE_TERM_MAX_DEPTH') := '7';"," ","* Specify the minimum number of training rows in a node expressed as a percentage of the rows in the training data."," It requires a number between 0 and 10. The default is 0.05, indicating 0.05%. "," v_setlst('TREE_TERM_MINPCT_NODE') := '0.05';"," ","* Specify the minimum number of rows required to consider splitting a node expressed as a percentage of the training rows."," It requires a number greater than 0 and less than or equal to 20. The default is 0.1, indicating 0.1%. 
"," v_setlst('TREE_TERM_MINPCT_SPLIT') := '0.1';","","* Specify the minimum number of rows in a node."," It requires a number greater than or equal to zero. The default is 10. "," v_setlst('TREE_TERM_MINREC_NODE') := '10';"," ","* Specify the criteria for splits regarding the minimum number of records in a parent node expressed as a value. "," No split is attempted if the number of records is below this value. It requires a number greater than 1. The default is 20. "," v_setlst('TREE_TERM_MINREC_SPLIT') := '20';"," ","* Specify the maximum number of bins for each attribute."," For Decision Tree it requires a number between 2 and 2,147,483,647, with the default value of 32. For Random Forest it requires a number between 2 and 254, with the default value of 32."," v_setlst('CLAS_MAX_SUP_BINS') := '32'; "," "],"enabled":true,"result":null,"sizeX":0,"hideCode":true,"width":12,"hideResult":false,"dynamicFormParams":null,"row":0,"hasTitle":true,"hideVizConfig":true,"hideGutter":true,"relations":[],"forms":"[]"},{"col":0,"visualizationConfig":null,"hideInIFrame":false,"selectedVisualization":"raw","title":"Build a Random Forest model for Feature Selection","message":["%script","","BEGIN DBMS_DATA_MINING.DROP_MODEL('RF_FEAT_SELECTION');"," EXCEPTION WHEN OTHERS THEN NULL; ","END;","/","DECLARE"," V_SETLST DBMS_DATA_MINING.SETTING_LIST;"," ","BEGIN"," V_SETLST('PREP_AUTO') := 'ON';"," V_SETLST('ALGO_NAME') := 'ALGO_RANDOM_FOREST';"," V_SETLST('RFOR_MTRY') := '3';"," V_SETLST('RFOR_NUM_TREES') := '25';"," V_SETLST('RFOR_SAMPLING_RATIO') := '0.5';"," "," DBMS_DATA_MINING.CREATE_MODEL2("," MODEL_NAME => 'RF_FEAT_SELECTION',"," MINING_FUNCTION => 'CLASSIFICATION',"," DATA_QUERY => 'SELECT * FROM CUSTOMER_INSURANCE_LTV',"," SET_LIST => V_SETLST,"," CASE_ID_COLUMN_NAME => 'CUSTOMER_ID',"," TARGET_COLUMN_NAME => 'BUY_INSURANCE'"," 
);","END;"],"enabled":true,"result":null,"sizeX":0,"hideCode":false,"width":12,"hideResult":false,"dynamicFormParams":null,"row":0,"hasTitle":true,"hideVizConfig":false,"hideGutter":true,"relations":[],"forms":"[]"},{"col":0,"visualizationConfig":null,"hideInIFrame":false,"selectedVisualization":"table","title":"Plot attributes by importance for predicting people who buy insurance","message":["%sql","","SELECT ATTRIBUTE_NAME, ATTRIBUTE_IMPORTANCE","FROM DM$VARF_FEAT_SELECTION ","ORDER BY ATTRIBUTE_IMPORTANCE DESC","FETCH FIRST 10 ROWS ONLY;"],"enabled":true,"result":null,"sizeX":0,"hideCode":false,"width":12,"hideResult":false,"dynamicFormParams":null,"row":0,"hasTitle":true,"hideVizConfig":false,"hideGutter":true,"relations":[],"forms":"[]"},{"col":0,"visualizationConfig":null,"hideInIFrame":false,"selectedVisualization":"html","title":null,"message":["%md","","### Feature importance using Decision Tree","---"],"enabled":true,"result":null,"sizeX":0,"hideCode":true,"width":12,"hideResult":false,"dynamicFormParams":null,"row":0,"hasTitle":false,"hideVizConfig":true,"hideGutter":true,"relations":[],"forms":"[]"},{"col":0,"visualizationConfig":null,"hideInIFrame":false,"selectedVisualization":"html","title":"Examples of possible setting overrides for DT","message":["%md","","If the user does not override the default settings, then relevant settings are determined by the algorithm.","","A complete list of settings can be found in the Documentation link:","","Algorithm Settings: <a href=\"https://docs.oracle.com/en/database/oracle/oracle-database/23/arpls/DBMS_DATA_MINING.html#GUID-03435110-D723-42FD-B4EA-39C86A039566\" onclick=\"return ! 
window.open('https://docs.oracle.com/en/database/oracle/oracle-database/23/arpls/DBMS_DATA_MINING.html#GUID-03435110-D723-42FD-B4EA-39C86A039566');\">Decision Tree<\/a>","","Shared Settings: <a href=\"https://docs.oracle.com/en/database/oracle/oracle-database/23/arpls/DBMS_DATA_MINING.html#GUID-24047A09-0542-4870-91D8-329F28B0ED75\" onclick=\"return ! window.open('https://docs.oracle.com/en/database/oracle/oracle-database/23/arpls/DBMS_DATA_MINING.html#GUID-24047A09-0542-4870-91D8-329F28B0ED75');\">All algorithms<\/a>","","* Specify a row weight column "," v_setlst('ODMS_ROW_WEIGHT_COLUMN_NAME') := '<row_weight_column_name>';"," ","* Specify a missing value treatment method for the training data. This setting does not affect the scoring data. The default value is `ODMS_MISSING_VALUE_AUTO`. The option `ODMS_MISSING_VALUE_MEAN_MODE` replaces missing values with the mean (numeric attributes) or the mode (categorical attributes) both at build time and apply time where appropriate. The option `ODMS_MISSING_VALUE_AUTO` performs different strategies for different algorithms. When `ODMS_MISSING_VALUE_TREATMENT` is set to `ODMS_MISSING_VALUE_DELETE_ROW`, the rows in the training data that contain missing values are deleted. However, if you want to replicate this missing value treatment in the scoring data, then you must perform the transformation explicitly."," v_setlst('ODMS_MISSING_VALUE_TREATMENT') := 'ODMS_MISSING_VALUE_AUTO';","","* Specify the tree impurity metric for Decision Tree. "," Tree algorithms seek the best test question for splitting data at each node. The best splitter and split values are those that result in the largest increase in target value homogeneity (purity) for the entities in the node. Purity is measured by a metric. Decision trees can use either Gini `TREE_IMPURITY_GINI` or entropy `TREE_IMPURITY_ENTROPY` as the purity metric. 
By default, the algorithm uses `TREE_IMPURITY_GINI`."," v_setlst('TREE_IMPURITY_METRIC') := 'TREE_IMPURITY_GINI';"," ","* Specify the criteria for splits regarding the maximum tree depth (the maximum number of nodes between the root and any leaf node, including the leaf node)."," For Decision Tree, it requires a number between 2 and 20, and the default is 7. For Random Forest it is a number between 2 and 100, and the default is 16."," v_setlst('TREE_TERM_MAX_DEPTH') := '7';"," ","* Specify the minimum number of training rows in a node expressed as a percentage of the rows in the training data."," It requires a number between 0 and 10. The default is 0.05, indicating 0.05%. "," v_setlst('TREE_TERM_MINPCT_NODE') := '0.05';"," ","* Specify the minimum number of rows required to consider splitting a node expressed as a percentage of the training rows."," It requires a number greater than 0 and less than or equal to 20. The default is 0.1, indicating 0.1%. "," v_setlst('TREE_TERM_MINPCT_SPLIT') := '0.1';","","* Specify the minimum number of rows in a node."," It requires a number greater than or equal to zero. The default is 10. "," v_setlst('TREE_TERM_MINREC_NODE') := '10';"," ","* Specify the criteria for splits regarding the minimum number of records in a parent node expressed as a value. "," No split is attempted if the number of records is below this value. It requires a number greater than 1. The default is 20. "," v_setlst('TREE_TERM_MINREC_SPLIT') := '20';"," ","* Specify the maximum number of bins for each attribute."," For Decision Tree it requires a number between 2 and 2,147,483,647, with the default value of 32. 
For Random Forest it requires a number between 2 and 254, with the default value of 32."," v_setlst('CLAS_MAX_SUP_BINS') := '32'; "," ","---"],"enabled":true,"result":null,"sizeX":0,"hideCode":true,"width":12,"hideResult":false,"dynamicFormParams":null,"row":0,"hasTitle":true,"hideVizConfig":true,"hideGutter":true,"relations":[],"forms":"[]"},{"col":0,"visualizationConfig":null,"hideInIFrame":false,"selectedVisualization":"raw","title":"Build a Decision Tree model for Feature Selection","message":["%script","","BEGIN DBMS_DATA_MINING.DROP_MODEL('DT_FEAT_SELECTION');"," EXCEPTION WHEN OTHERS THEN NULL; ","END;","/","DECLARE"," V_SETLST DBMS_DATA_MINING.SETTING_LIST;"," ","BEGIN"," V_SETLST('PREP_AUTO') := 'ON';"," V_SETLST('ALGO_NAME') := 'ALGO_DECISION_TREE';"," V_SETLST('TREE_TERM_MAX_DEPTH') := '10';"," V_SETLST('TREE_TERM_MINPCT_NODE') := '0.01';"," "," DBMS_DATA_MINING.CREATE_MODEL2("," MODEL_NAME => 'DT_FEAT_SELECTION',"," MINING_FUNCTION => 'CLASSIFICATION',"," DATA_QUERY => 'SELECT * FROM CUSTOMER_INSURANCE_LTV',"," SET_LIST => V_SETLST,"," CASE_ID_COLUMN_NAME => 'CUSTOMER_ID',"," TARGET_COLUMN_NAME => 'BUY_INSURANCE'"," );","END;"],"enabled":true,"result":null,"sizeX":0,"hideCode":false,"width":12,"hideResult":false,"dynamicFormParams":null,"row":0,"hasTitle":true,"hideVizConfig":false,"hideGutter":true,"relations":[],"forms":"[]"},{"col":0,"visualizationConfig":null,"hideInIFrame":false,"selectedVisualization":"table","title":"Display Decision Tree nodes with top support","message":["%sql","","SELECT * ","FROM (SELECT NODE, "," NODE_SUPPORT, "," PREDICTED_TARGET_VALUE, "," PARENT,"," ATTRIBUTE_NAME,"," OPERATOR, "," SPLIT.VAL"," FROM DM$VODT_FEAT_SELECTION A,"," XMLTABLE( '/Element' PASSING A.VALUE"," COLUMNS"," VAL VARCHAR2(20) PATH '.') SPLIT "," ORDER BY 1,2,3,4,5,6,7)","ORDER BY NODE_SUPPORT DESC","FETCH FIRST 10 ROWS 
ONLY;"],"enabled":true,"result":null,"sizeX":0,"hideCode":false,"width":12,"hideResult":false,"dynamicFormParams":null,"row":0,"hasTitle":true,"hideVizConfig":false,"hideGutter":true,"relations":[],"forms":"[]"},{"col":0,"visualizationConfig":null,"hideInIFrame":false,"selectedVisualization":"table","title":"Select the attributes and their average support","message":["%sql","","SELECT DISTINCT ATTRIBUTE_NAME, "," ROUND(AVG(NODE_SUPPORT),2) AVG_SUPPORT","FROM (SELECT * "," FROM (SELECT NODE, "," NODE_SUPPORT, "," PREDICTED_TARGET_VALUE, "," PARENT,"," ATTRIBUTE_NAME,"," OPERATOR, "," SPLIT.VAL"," FROM DM$VODT_FEAT_SELECTION A,"," XMLTABLE( '/Element' PASSING A.VALUE"," COLUMNS"," VAL VARCHAR2(20) PATH '.') SPLIT "," ORDER BY 1,2,3,4,5,6,7)"," )"," GROUP BY ATTRIBUTE_NAME"," ORDER BY AVG_SUPPORT DESC"],"enabled":true,"result":null,"sizeX":0,"hideCode":false,"width":6,"hideResult":false,"dynamicFormParams":null,"row":0,"hasTitle":true,"hideVizConfig":false,"hideGutter":true,"relations":[],"forms":"[]"},{"col":0,"visualizationConfig":null,"hideInIFrame":false,"selectedVisualization":"table","title":"Select the attributes and their average support","message":["%sql","","SELECT DISTINCT ATTRIBUTE_NAME, "," ROUND(AVG(NODE_SUPPORT),2) AVG_SUPPORT","FROM (SELECT * "," FROM (SELECT NODE, "," NODE_SUPPORT, "," PREDICTED_TARGET_VALUE, "," PARENT,"," ATTRIBUTE_NAME,"," OPERATOR, "," SPLIT.VAL"," FROM DM$VODT_FEAT_SELECTION A,"," XMLTABLE( '/Element' PASSING A.VALUE"," COLUMNS"," VAL VARCHAR2(20) PATH '.') SPLIT "," ORDER BY 1,2,3,4,5,6,7)"," )"," GROUP BY ATTRIBUTE_NAME"," ORDER BY AVG_SUPPORT DESC"],"enabled":true,"result":null,"sizeX":0,"hideCode":false,"width":6,"hideResult":false,"dynamicFormParams":null,"row":0,"hasTitle":true,"hideVizConfig":false,"hideGutter":true,"relations":[],"forms":"[]"},{"col":0,"visualizationConfig":null,"hideInIFrame":false,"selectedVisualization":"html","title":null,"message":["%md","","# End of 
Script"],"enabled":true,"result":null,"sizeX":0,"hideCode":true,"width":12,"hideResult":false,"dynamicFormParams":null,"row":0,"hasTitle":false,"hideVizConfig":true,"hideGutter":true,"relations":[],"forms":"[]"},{"col":0,"visualizationConfig":null,"hideInIFrame":false,"selectedVisualization":"html","title":null,"message":["%md"],"enabled":true,"result":null,"sizeX":0,"hideCode":true,"width":12,"hideResult":true,"dynamicFormParams":null,"row":0,"hasTitle":false,"hideVizConfig":true,"hideGutter":true,"relations":[],"forms":"[]"}],"version":"6","snapshot":false,"tags":null}]